{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Lesson 2\n",
    "\n",
    "Python Basic, Lesson 2, v1.0.0, 2016.12 by David.Yi   \n",
    "Python Basic, Lesson 2, v1.0.1, 2017.02 modified by Yimeng.Zhang  \n",
    "\n",
    "\n",
    "### Recap of the previous lesson\n",
    "\n",
    "* Introduction to Python\n",
    "* Setup\n",
    "    * Using standard Python and IDLE\n",
    "    * Introduction to Anaconda\n",
    "    * Introduction to Jupyter and notebooks\n",
    "* Basic variable concepts\n",
    "* Using print() and input()\n",
    "* Introduction to PyCharm\n",
    "    \n",
    "### Topics in this lesson\n",
    "\n",
    "* Loops: for and range()\n",
    "* Common data types\n",
    "    * list\n",
    "    * dict\n",
    "    * tuple\n",
    "* Random numbers\n",
    "* Examples\n",
    "    * Chinese word segmentation\n",
    "    * Practice program (rock-paper-scissors)"
   ]
  },
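  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### The for...in loop\n",
    "\n",
    "Python's main loop is the for...in loop, which iterates over the elements of a list, tuple, or other iterable one at a time. Python also has a while loop, which is used less often.\n",
    "\n",
    "You can read `for x in ...` as: assign each element in turn to the variable x, then run the indented block.\n",
    "\n",
    "For a loop that simply runs a fixed number of times, use the range() function, which produces a sequence of integers to iterate over."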
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "a\n",
      "b\n",
      "c\n",
      "d\n",
      "e\n",
      "f\n"
     ]
    }
   ],
   "source": [
    "# Iterate over a string one character at a time\n",
    "s = 'abcdef'\n",
    "for i in s:\n",
    "    print(i)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "a\n",
      "b\n",
      "c\n"
     ]
    }
   ],
   "source": [
    "s = ['a', 'b', 'c']\n",
    "for i in s:\n",
    "    print(i)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0\n",
      "1\n",
      "2\n"
     ]
    }
   ],
   "source": [
    "for i in range(3):\n",
    "    print(i)"
   ]
  },
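  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### The while loop\n",
    "\n",
    "The while loop is Python's other loop construct. It keeps running its indented block as long as its condition is true and stops as soon as the condition becomes false.\n",
    "\n",
    "The condition is a Boolean expression: it is evaluated before each pass and must yield a true or false value."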
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The count is: 0\n",
      "The count is: 1\n",
      "The count is: 2\n",
      "The count is: 3\n",
      "The count is: 4\n",
      "The count is: 5\n",
      "The count is: 6\n",
      "The count is: 7\n",
      "The count is: 8\n",
      "Good bye!\n"
     ]
    }
   ],
   "source": [
    "count = 0\n",
    "while count < 9:\n",
    "    print('The count is:', count)\n",
    "    count += 1\n",
    "print(\"Good bye!\")"
   ]
  },
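  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "##### The range() function\n",
    "\n",
    "range() produces an arithmetic sequence of integers. range(x, y, z) starts at x and stops before y (y itself is excluded); z is the step, which may be negative. With one argument, range(y) counts from 0 up to y - 1.\n",
    "\n",
    "Change the start, stop, and step values in the range() calls below to see the different effects."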
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1\n",
      "4\n",
      "7\n"
     ]
    }
   ],
   "source": [
    "for i in range(1,10,3):\n",
    "    print(i)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "10\n",
      "9\n",
      "8\n",
      "7\n",
      "6\n",
      "5\n",
      "4\n",
      "3\n"
     ]
    }
   ],
   "source": [
    "for i in range(10,2,-1):\n",
    "    print(i)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0 Mary \n",
      "1 had\n",
      "2 a\n",
      "3 little \n",
      "4 lamb\n"
     ]
    }
   ],
   "source": [
    "# One way to get both the index and the value of each list element\n",
    "# len() returns the length of the list, i.e. the number of elements\n",
    "\n",
    "s = ['Mary ', 'had', 'a', 'little ', 'lamb']\n",
    "for i in range(len(s)):\n",
    "    print(i, s[i])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0 Mary \n",
      "1 had\n",
      "2 a\n",
      "3 little \n",
      "4 lamb\n"
     ]
    }
   ],
   "source": [
    "# A better way: use enumerate(), which yields (index, item) pairs\n",
    "\n",
    "s = ['Mary ', 'had', 'a', 'little ', 'lamb']\n",
    "for i, item in enumerate(s):\n",
    "    print(i, item)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Strings\n",
    "\n",
    "A string is an ordered sequence of characters, used to store or represent text-based information."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "spam\n"
     ]
    }
   ],
   "source": [
    "a = 'spam'  # single quotes\n",
    "print(a)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "spam\n",
      "spam's log\n"
     ]
    }
   ],
   "source": [
    "a = \"spam\"  # double quotes\n",
    "print(a)\n",
    "b = \"spam's log\"  # double quotes allow an apostrophe inside the string\n",
    "print(b)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "multiple lines\n",
      "this is an example\n"
     ]
    }
   ],
   "source": [
    "# Multi-line string\n",
    "a = '''multiple lines\n",
    "this is an example'''\n",
    "print(a)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "a\n",
      "b\tc\n"
     ]
    }
   ],
   "source": [
    "s = 'a\\nb\\tc'  # escape sequences: \\n is a newline, \\t is a tab\n",
    "print(s)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "C:\n",
      "ew\text.txt\n",
      "C:\\new\\text.txt\n"
     ]
    }
   ],
   "source": [
    "# Raw strings\n",
    "a = 'C:\\new\\text.txt'  # \\n and \\t are interpreted as newline and tab\n",
    "print(a)\n",
    "b = r'C:\\new\\text.txt'  # the r prefix keeps backslashes literal\n",
    "print(b)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "ABCabc.txt123\n",
      "ABCabc.txtABCabc.txt\n",
      "A\n",
      "t\n",
      "txt.cbaCBA\n",
      "AB\n",
      "10\n",
      "abcabc.txt\n",
      "ABCABC.TXT\n",
      "True\n",
      "True\n"
     ]
    }
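   ],
   "source": [
    "# Common string operations\n",
    "s = 'ABCabc.txt'\n",
    "print(s + '123')  # concatenation\n",
    "print(s * 2)  # repetition\n",
    "print(s[0])  # indexing (first character)\n",
    "print(s[-1])  # a negative index counts from the end\n",
    "print(s[::-1])  # reverse\n",
    "print(s[0:2])  # slice (start included, end excluded)\n",
    "print(len(s))  # length\n",
    "print(s.lower())  # convert to lowercase\n",
    "print(s.upper())  # convert to uppercase\n",
    "print(s.endswith('.txt'))  # suffix test\n",
    "print('abc' in s)  # membership test"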
   ]
  },
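  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "---\n",
    "\n",
    "####  Lists\n",
    "\n",
    "list is one of Python's built-in data types. A list is a mutable, ordered collection: elements can be added and removed at any time.\n",
    "Lists are among the most commonly used data types in everyday Python programs.\n",
    "\n",
    "Basic list operations covered here:\n",
    "* Creating a list\n",
    "* Accessing elements\n",
    "* Adding elements\n",
    "* Removing elements\n",
    "* Sorting\n",
    "* Multiple dimensions (nested lists)\n",
    "* Saving and loading\n",
    "\n",
    "Besides plain numbers and strings, a list can hold any Python object, including other lists."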
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "['pig', 'cat', 'dog']\n",
      "cat\n",
      "dog\n"
     ]
    }
   ],
   "source": [
    "# Create a list directly\n",
    "\n",
    "a = ['pig', 'cat', 'dog']\n",
    "\n",
    "print(a)\n",
    "\n",
    "# Access elements by index; indexing works from both ends\n",
    "print(a[1])\n",
    "print(a[-1])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "['bird']\n",
      "['bird', 'snake']\n",
      "['sheep', 'bird', 'snake']\n"
     ]
    }
   ],
   "source": [
    "# Start with an empty list\n",
    "a = []\n",
    "\n",
    "# Append elements to the end of the list\n",
    "a.append('bird')\n",
    "print(a)\n",
    "a.append('snake')\n",
    "print(a)\n",
    "\n",
    "# Insert an element at a given position\n",
    "a.insert(0,'sheep')\n",
    "print(a)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "['sheep', 'snake']\n"
     ]
    }
   ],
   "source": [
    "# Remove the element at a given index\n",
    "a.pop(1)\n",
    "print(a)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "['pig', 'cat']\n"
     ]
    }
   ],
   "source": [
    "# Remove an element by value (only the first match is removed)\n",
    "\n",
    "a = ['pig', 'cat', 'dog']\n",
    "a.remove('dog')\n",
    "print(a)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "['snake', 'pig', 'dog', 'cat']\n"
     ]
    }
   ],
   "source": [
    "# Sort a list\n",
    "\n",
    "a = ['pig', 'cat', 'dog', 'snake']\n",
    "# a = [1,2,3,4]\n",
    "a.sort(reverse = True)  # sorts in place, in descending order\n",
    "print(a)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "['snake', 'pig', 'dog', 'cat', ['cat1', 'cat2', 'cat3']]\n"
     ]
    }
   ],
   "source": [
    "# Append another list to a list\n",
    "# append() adds the whole list as a single element of the original list\n",
    "\n",
    "b = ['cat1', 'cat2', 'cat3']\n",
    "a.append(b)\n",
    "print(a)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "['pig', 'cat', 'dog', 'snake', 'cat1', 'cat2', 'cat3']\n"
     ]
    }
   ],
   "source": [
    "# Append the elements of another list\n",
    "# extend() adds each element of b to a individually\n",
    "\n",
    "a = ['pig', 'cat', 'dog', 'snake']\n",
    "b = ['cat1', 'cat2', 'cat3']\n",
    "a.extend(b)\n",
    "print(a)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2\n"
     ]
    }
   ],
   "source": [
    "# Count how many times an element occurs in a list\n",
    "\n",
    "a = ['a','b','b','c']\n",
    "print(a.count('b'))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Save the contents of a list\n",
    "\n",
    "# Use the pickle module\n",
    "import pickle\n",
    "\n",
    "a = ['pig', 'cat', 'dog', 'snake', 'snake']\n",
    "# The with statement closes the file even if an error occurs\n",
    "with open('list_dump.txt', 'wb') as f:\n",
    "    pickle.dump(a, f)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "['pig', 'cat', 'dog', 'snake', 'snake']\n"
     ]
    }
   ],
   "source": [
    "# Load the list back from the file\n",
    "\n",
    "with open('list_dump.txt', 'rb') as f:\n",
    "    a1 = pickle.load(f)\n",
    "print(a1)"
   ]
  },
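  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "\n",
    "#### Serialization with pickle\n",
    "\n",
    "Python's pickle module can serialize an object to a file on disk and later read it back and restore it. The call that writes an object is:\n",
    "\n",
    "```pickle.dump(obj, file[, protocol])```\n",
    "\n",
    "pickle is a Python-specific format. Data written with a newer protocol version may not be readable by older Python versions, and unpickling data from an untrusted source is unsafe. For simple data such as lists and dicts, JSON is usually a more portable choice.\n"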
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[['pig', 'cat', 'dog'], [1, 2, 3]]\n",
      "cat\n",
      "3\n"
     ]
    }
   ],
   "source": [
    "# Two-dimensional (nested) list\n",
    "\n",
    "a = ['pig', 'cat', 'dog']\n",
    "b = [1,2,3]\n",
    "c = []\n",
    "\n",
    "c.append(a)\n",
    "c.append(b)\n",
    "\n",
    "print(c)\n",
    "# Element 1 of element 0 (indices start at 0)\n",
    "print(c[0][1]) # cat\n",
    "\n",
    "# Element 2 of element 1\n",
    "print(c[1][2]) # 3"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "['pig', 'cat']\n"
     ]
    }
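   ],
   "source": [
    "# A first look at list comprehensions\n",
    "# Filter a list by how often each element occurs\n",
    "\n",
    "a = ['pig', 'cat', 'dog', 'dog']\n",
    "\n",
    "# Loop version: collect the elements that occur two or more times\n",
    "b = []\n",
    "for i in a:\n",
    "    if a.count(i)>=2:\n",
    "        b.append(i)\n",
    "#print(b)\n",
    "\n",
    "# How do we keep only the elements that occur exactly once?\n",
    "# A list comprehension does it in one line\n",
    "b = [x for x in a if a.count(x) == 1]\n",
    "print(b)"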
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "---\n",
    "\n",
    "####  字典\n",
    "\n",
    "字典是另一种可变容器模型，可存储任意类型对象。\n",
    "\n",
    "字典的每个键值(key=>value)对用冒号(:)分割，每个对之间用逗号(,)分割，整个字典包括在花括号({})中，格式如 `d = {key1 : value1, key2 : value2 }`\n",
    "\n",
    "字典中的 key 值不可以重复（定义后也没有办法重复）\n",
    "\n",
    "字典的几个特点：\n",
    "1. 查找和插入的速度极快，不会随着key的增加而变慢；   \n",
    "2. 需要占用大量的内存，内存浪费多。   \n",
    "而list相反：\n",
    "1. 查找和插入的时间随着元素的增加而增加；   \n",
    "2. 占用空间小，浪费内存很少。   \n",
    "\n",
    "* 创建字典\n",
    "* 访问字典中的 key-value\n",
    "* 修改字典中的 key-value\n",
    "* 获得字典中指定 key 的 value\n",
    "* 删除字典中的 key"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'Tom': 95, 'Tracy': 92, 'Jerry': 90}\n",
      "95\n"
     ]
    }
   ],
   "source": [
    "# Define a dictionary\n",
    "# Access a value by its key\n",
    "\n",
    "d = {'Tom': 95, 'Jerry': 90, 'Tracy': 92}\n",
    "print(d)\n",
    "print(d['Tom'])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'Hugo': 85, 'Tom': 95, 'Tracy': 92, 'Jerry': 90}\n"
     ]
    }
   ],
   "source": [
    "# Add an element to a dictionary\n",
    "\n",
    "d['Hugo'] = 85\n",
    "print(d)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'Hugo': 85, 'Tom': 97, 'Tracy': 92, 'Jerry': 90}\n",
      "True\n",
      "97\n",
      "80\n"
     ]
    }
   ],
   "source": [
    "# Change the value of an element\n",
    "d['Tom'] = 97\n",
    "print(d)\n",
    "\n",
    "# Check whether a key exists\n",
    "print('Tom' in d)\n",
    "\n",
    "# Get the value for a key\n",
    "# (indexing a missing key, e.g. d['AAA'], raises KeyError)\n",
    "print(d['Tom'])\n",
    "\n",
    "# Get the value for a key that may be missing, with a default\n",
    "print(d.get('Tommy',80))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'Tracy': 92, 'Jerry': 90, 'Tom': 95}\n",
      "{'Tracy': 92, 'Jerry': 90}\n"
     ]
    }
   ],
   "source": [
    "# Delete a key from a dictionary\n",
    "\n",
    "d = {'Tom': 95, 'Jerry': 90, 'Tracy': 92}\n",
    "print(d)\n",
    "d.pop('Tom')  # removes the key and returns its value\n",
    "print(d)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'id_21': 63, 'id_27': 81, 'id_26': 78, 'id_1': 3, 'id_24': 72, 'id_22': 66, 'id_4': 12, 'id_18': 54, 'id_5': 15, 'id_3': 9, 'id_11': 33, 'id_13': 39, 'id_9': 27, 'id_29': 87, 'id_23': 69, 'id_8': 24, 'id_6': 18, 'id_28': 84, 'id_10': 30, 'id_15': 45, 'id_7': 21, 'id_2': 6, 'id_0': 0, 'id_19': 57, 'id_12': 36, 'id_14': 42, 'id_20': 60, 'id_17': 51, 'id_25': 75, 'id_16': 48}\n",
      "30\n"
     ]
    }
   ],
   "source": [
    "# Get the number of entries in a dictionary\n",
    "\n",
    "d1 = {}\n",
    "for i in range(30):\n",
    "    d1['id_'+str(i)] = i*3\n",
    "print(d1)\n",
    "\n",
    "print(len(d1))"
   ]
  },
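  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "---\n",
    "\n",
    "#### Tuples\n",
    "\n",
    "A tuple is also an ordered sequence and stores data much like a list.\n",
    "Once a tuple has been created, its contents cannot be changed, which protects the data from accidental modification.\n",
    "\n",
    "When to use one: whenever you need list-like behavior but the contents should never change, a tuple is the safer choice, because the program cannot modify it by mistake. Python itself uses tuples this way: extra positional arguments collected with `*args` arrive in the function as a tuple, and a function that returns several values returns them as a tuple.\n",
    "\n"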
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "('Tom', 'Jerry', 'Mary')\n"
     ]
    }
   ],
   "source": [
    "# Create a tuple\n",
    "\n",
    "t = ('Tom', 'Jerry', 'Mary')\n",
    "print(t)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Jerry\n"
     ]
    }
   ],
   "source": [
    "# Access an element of a tuple\n",
    "\n",
    "print(t[1])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "ename": "AttributeError",
     "evalue": "'tuple' object has no attribute 'append'",
     "output_type": "error",
     "traceback": [
      "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[1;31mAttributeError\u001b[0m                            Traceback (most recent call last)",
      "\u001b[1;32m<ipython-input-8-e2bbceae4065>\u001b[0m in \u001b[0;36m<module>\u001b[1;34m()\u001b[0m\n\u001b[0;32m      1\u001b[0m \u001b[1;31m# A tuple cannot be modified after it is created\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m      2\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 3\u001b[1;33m \u001b[0mt\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mappend\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m'Someone'\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m",
      "\u001b[1;31mAttributeError\u001b[0m: 'tuple' object has no attribute 'append'"
     ]
    }
   ],
   "source": [
    "# A tuple cannot be modified after it is created\n",
    "\n",
    "t.append('Someone')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "ename": "TypeError",
     "evalue": "'tuple' object does not support item assignment",
     "output_type": "error",
     "traceback": [
      "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[1;31mTypeError\u001b[0m                                 Traceback (most recent call last)",
      "\u001b[1;32m<ipython-input-9-a9cd518cf863>\u001b[0m in \u001b[0;36m<module>\u001b[1;34m()\u001b[0m\n\u001b[1;32m----> 1\u001b[1;33m \u001b[0mt\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;36m1\u001b[0m\u001b[1;33m]\u001b[0m \u001b[1;33m=\u001b[0m \u001b[1;34m'aaa'\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m",
      "\u001b[1;31mTypeError\u001b[0m: 'tuple' object does not support item assignment"
     ]
    }
   ],
   "source": [
    "t[1] = 'aaa'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(['A', 'B', 'C'], 100, 200)\n"
     ]
    }
   ],
   "source": [
    "# Create a slightly more complex tuple\n",
    "\n",
    "l = ['A', 'B', 'C']\n",
    "t = (l, 100, 200)\n",
    "print(t)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(['A', 'B', 'C'], 100, 200)\n",
      "['A', 'B', 'C', 'D']\n",
      "(['A', 'B', 'C', 'D'], 100, 200)\n"
     ]
    }
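   ],
   "source": [
    "# A workaround for \"changing\" the contents of a tuple\n",
    "print(t)\n",
    "l.append('D')\n",
    "print(l)\n",
    "print(t)  # each slot of the tuple always refers to the same object, but that object itself may be mutable"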
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<class 'int'>\n",
      "<class 'tuple'>\n",
      "(1,)\n"
     ]
    }
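   ],
   "source": [
    "# Create a tuple with a single element\n",
    "\n",
    "l = (1)\n",
    "print(type(l))  # l is an int: the parentheses are ambiguous and are read as ordinary grouping parentheses\n",
    "\n",
    "# A one-element tuple needs a trailing comma to remove the ambiguity\n",
    "l = (1,)\n",
    "print(type(l))\n",
    "print(l)"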
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "\n",
    "#### 随机数\n",
    "\n",
    "随机数这一概念在不同领域有着不同的含义，在密码学、通信领域有着非常重要的用途。\n",
    "\n",
    "Python 的随机数模块是 random，random 模块主要有以下函数，结合例子来看看。\n",
    "\n",
    "* random.choice()\n",
    "* random.sample() \n",
    "* random.random()\n",
    "* random.uniform()\n",
    "* random.randint()\n",
    "* random.shuffle()\n",
    "* random.sample"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "45\n"
     ]
    }
   ],
   "source": [
    "import random\n",
    "\n",
    "# random.choice() returns one random element from a sequence.\n",
    "# Signature: random.choice(sequence), where sequence is a non-empty sequence such as a list, tuple, or range.\n",
    "\n",
    "print(random.choice(range(1,100)))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "b\n"
     ]
    }
   ],
   "source": [
    "# Pick a random element from a list\n",
    "\n",
    "a = ['a', 'b', 'c']\n",
    "print(random.choice(a))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[35, 98, 29, 20, 31, 73, 22, 50, 91, 77]\n"
     ]
    }
   ],
   "source": [
    "# Draw 10 distinct random integers from the range 1 to 99\n",
    "print(random.sample(range(1,100), 10))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "16\n"
     ]
    }
   ],
   "source": [
    "# random.randint(a, b) returns a random integer in a given range.\n",
    "# a is the lower bound and b is the upper bound; both ends are included: a <= n <= b\n",
    "\n",
    "print(random.randint(1,100))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "3\n"
     ]
    }
   ],
   "source": [
    "# Signature: random.randrange([start], stop[, step])\n",
    "# Returns a random element of range(start, stop, step); stop itself is never returned.\n",
    "\n",
    "print(random.randrange(1,10))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.5265744262403391\n"
     ]
    }
   ],
   "source": [
    "# random.random() returns a random floating-point number n with 0 <= n < 1.0\n",
    "\n",
    "print(random.random())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "38.8422507789255\n",
      "35.132838608034035\n"
     ]
    }
   ],
   "source": [
    "# Signature: random.uniform(a, b)\n",
    "# Returns a random floating-point number between the two bounds, which may be given in either order.\n",
    "# If a <= b, the result n satisfies a <= n <= b; if b < a, then b <= n <= a.\n",
    "\n",
    "print(random.uniform(1,100))\n",
    "print(random.uniform(50,10))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[87, 1, 23, 5, 12]\n"
     ]
    }
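   ],
   "source": [
    "# Signature: random.shuffle(x)\n",
    "# Shuffles the elements of the list x in place and returns None.\n",
    "# (The optional second argument of older versions was removed in Python 3.11.)\n",
    "\n",
    "a = [12, 23, 1, 5, 87]\n",
    "random.shuffle(a)\n",
    "print(a)"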
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[4, 0, 8, 5, 7]\n",
      "[0, 5, 3, 4, 8, 1, 9]\n"
     ]
    }
   ],
   "source": [
    "# Signature: random.sample(sequence, k)\n",
    "# Returns a new list of k elements chosen at random without replacement; the original sequence is not modified.\n",
    "\n",
    "print(random.sample(range(10),5))\n",
    "print(random.sample(range(10),7))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "\n",
    "Python program examples"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Full Mode: 今天/ 天上/ 上海/ 的/ 天气/ 怎么/ 怎么样\n",
      "Default Mode: 明天/ 纽约/ 下雨/ 么\n",
      "现在, 天气, 怎么样\n",
      "小明, 硕士, 毕业, 于, 中国科学院, 计算所, ，, 后, 在, 日本京都大学, 深造\n",
      "小明, 硕士, 毕业, 于, 中国, 科学, 学院, 科学院, 中国科学院, 计算, 计算所, ，, 后, 在, 日本, 京都, 大学, 日本京都大学, 深造\n"
     ]
    }
   ],
   "source": [
    "import jieba\n",
    "\n",
    "# Full mode\n",
    "# Scans out every possible word in the sentence; very fast, but cannot resolve ambiguity\n",
    "seg_list = jieba.cut(\"今天上海的天气怎么样\", cut_all = True)\n",
    "print(\"Full Mode: \" + \"/ \".join(seg_list))\n",
    "\n",
    "# Accurate mode\n",
    "# Tries to cut the sentence as precisely as possible; well suited to text analysis\n",
    "seg_list = jieba.cut(\"明天纽约下雨么\", cut_all = False)\n",
    "print(\"Default Mode: \" + \"/ \".join(seg_list))\n",
    "\n",
    "# Accurate mode is the default\n",
    "seg_list = jieba.cut(\"现在天气怎么样\")\n",
    "print(\", \".join(seg_list))\n",
    "\n",
    "# Accurate mode is the default\n",
    "seg_list = jieba.cut(\"小明硕士毕业于中国科学院计算所，后在日本京都大学深造\")\n",
    "print(\", \".join(seg_list))\n",
    "\n",
    "# Search engine mode\n",
    "# Starting from accurate mode, splits long words again to improve recall; suited to search-engine indexing\n",
    "seg_list = jieba.cut_for_search(\"小明硕士毕业于中国科学院计算所，后在日本京都大学深造\")\n",
    "print(\", \".join(seg_list))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2016,年,第一季度,支付,事业部,交易量,报表\n"
     ]
    }
   ],
   "source": [
    "seg_list = jieba.cut(\"2016年第一季度支付事业部交易量报表\")  # accurate mode is the default\n",
    "print(','.join(seg_list))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2016\n",
      "年\n",
      "第一季度\n",
      "支付\n",
      "事业部\n",
      "交易量\n",
      "报表\n"
     ]
    }
   ],
   "source": [
    "seg_list = jieba.cut(\"2016年第一季度支付事业部交易量报表\")  # accurate mode is the default\n",
    "for i in seg_list:\n",
    "    print(i)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "我 r\n",
      "爱 v\n",
      "北京 ns\n",
      "天安门 ns\n"
     ]
    }
   ],
   "source": [
    "import jieba.posseg as pseg\n",
    "words = pseg.cut(\"我爱北京天安门\")\n",
    "\n",
    "for word, flag in words:\n",
    "    print('%s %s' % (word, flag))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# A simple game of rock-paper-scissors\n",
    "\n",
    "import random\n",
    "\n",
    "# Results returned by which_win(); each one is also an index into t2\n",
    "FIRST = 0   # the first player wins\n",
    "SECOND = 1  # the second player wins\n",
    "BOTH = 2    # draw\n",
    "\n",
    "t1 = ('scissors', 'rock', 'paper')\n",
    "t2 = ('human wins', 'computer wins', 'draw')\n",
    "\n",
    "def which_win(i1, i2):\n",
    "    if i1 == 0 and i2 == 1:\n",
    "        return SECOND\n",
    "    if i1 == 0 and i2 == 2:\n",
    "        return FIRST\n",
    "    if i1 == 1 and i2 == 0:\n",
    "        return FIRST\n",
    "    if i1 == 1 and i2 == 2:\n",
    "        return SECOND\n",
    "    if i1 == 2 and i2 == 0:\n",
    "        return SECOND\n",
    "    if i1 == 2 and i2 == 1:\n",
    "        return FIRST\n",
    "    if i1 == i2:\n",
    "        return BOTH\n",
    "\n",
    "print('0: scissors  1: rock  2: paper')\n",
    "human = int(input('Your move: '))\n",
    "\n",
    "c_index = random.randint(0,2)\n",
    "computer = t1[c_index]\n",
    "\n",
    "print(\"The computer played\", computer)\n",
    "\n",
    "print(t2[which_win(human, c_index)])\n",
    "\n",
    "# How could this be optimized and improved?\n",
    "# How would you add best-of-five scoring?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
